-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Experimental GPU support for Windows #103
Conversation
# The "-undefined dynamic_lookup" is a hack for systems with | ||
# multiple Python installed. If we link a particular Python version | ||
# here, and we import it with a different Python version later. | ||
# likely a segmentation fault. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sentence seems to be missing the beginning. (pre-existing)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably a merge error or I was just typing non-sense. Will fix it at some point.
@mworchel This works fine on my side and I'll merge it. I'm curious about the path tracing issue. What is wrong about it? |
Thanks for merging. Regarding the path tracing, I noticed that whenever I try to do it in GPU mode under Windows, the application will crash (in CPU mode it works fine, however the backprop does not work there). I invested some time and the crash is caused when calling I ran a cuda-memcheck but I think I didn't see any specific errors. I could try to place device sync points after the preceding functions like |
Interesting. The most suspicious candidate for this is the |
I added multiple sync points and interestingly, the call to if (rendered_image != nullptr) {
auto nd = channel_info.num_total_dimensions;
auto d = channel_info.radiance_dimension;
rendered_image[nd * pixel_id + d] += 0.5 /*weight * path_contrib[0]*/;
rendered_image[nd * pixel_id + d + 1] += 0.5 /*weight * path_contrib[1]*/;
rendered_image[nd * pixel_id + d + 2] += 0.5 /*weight * path_contrib[2]*/;
}
if (edge_contribs != nullptr) {
edge_contribs[pixel_id] += sum(weight * 0.5/* path_contrib*/);
} and with this, the forward pass succeeds. I'll try to comment in stuff one after another and pin the error down some more. Maybe you already have some kind of suspicion? "Luckily" it even fails with a 1x1 image, so I can watch one pixel in isolation. EDIT: Commenting in this line (https://github.com/BachiLi/redner/blob/master/path_contribution.cpp#L38) auto bsdf_val = bsdf(material, shading_point, wi, wo, min_rough); triggers the error. Maybe some out of bounds access on the textures? EDIT2: Hmm ok, when I comment out lines https://github.com/BachiLi/redner/blob/master/material.h#L404-L441 (specular component), the forward pass succeeds and I get the correct diffuse colors. EDIT3: When I comment out these lines https://github.com/BachiLi/redner/blob/master/material.h#L435-L440 the whole thing doesn't compile anymore with some template error in a thrust header. Ok, what? |
Ok, so I was able to fix the error (currently only trying the forward pass) by factoring the Fresnel term into two distinct terms, so replacing auto F = specular_reflectance +
(1.f - specular_reflectance) *
pow(max(1.f - cos_theta_d, Real(0)), 5.f); by Vector3 F1 = specular_reflectance;
Vector3 F2 = (1.f - specular_reflectance) * pow(max(1.0 - cos_theta_d, Real(0)), 5.0);
auto F = F1 + F2; I'm not sure what the root cause is, but I'd guess it has something to do with accessing references to temporary objects in the overloaded operators. Although the code looks fine to me. EDIT: The fix is extremely brittle, though. If you move the part |
I am not using any arcane expression template magic here, so I don't think it's the reference problem you mentioned. This is weird to a point that I start to suspect it's a compiler bug in nvcc. |
It definitely looks like something is very wrong. Especially the compiler error when commenting out certain lines is weird. I wouldn't be surprised if might be on nvcc's side, yes. The only stuff in the code that looks remotely suspicious as I'm not sure it's guaranteed to work by the cpp standard is the |
Ok, the compiler errors might be caused by some incremental build stuff going wrong or some other voodoo. It's not as consistently reproducible as I thought. However, I was able to fix the forward pathtracing pass with even less changes. Just replacing // Schlick's approximation
auto F = [...] * pow(max(1.f - cos_theta_d, Real(0)), 5.f); by // Schlick's approximation
auto F = [...] * pow(max(1.f - cos_theta_d, Real(0)), 5.0); so, replacing I cannot believe this is the fix but making sure the I'll test it on one more machine and will send a PR later. |
@BachiLi, I was just reminded of this PR and the discussion when I tried to compile Redner today and faced the same weird compilation error again. After all, you were right! It's a bug in NVCC which should be fixed in CUDA 11: Leaving this here for documentation purposes. |
This PR enables CUDA compilation for Windows and therefore introduces experimental GPU support. I re-organized parts of the
CMakeLists.txt
to have an (arguably) more ordered structure. Pybind support is now added by linking against an exported target instead of using the call topybind11_add_module
(which would interfere with usage ofcuda_add_library
).Pathtracing does still not work. However, albedo and deferred rendering work fine. So does backpropagation, at least for the GPU mode (see #93).
I didn't test the changes on linux or ios, so that's up for testing.